Ecogenomic Perspectives on Domains of Unknown Function: Correlation-Based Exploration of Marine Metagenomes
Abstract
BACKGROUND The proportion of conserved DNA sequences with no clear function is steadily growing in bioinformatics databases. Studies of sequence and structural homology have indicated that many uncharacterized protein domain sequences are variants of functionally described domains. If these variants promote an organism's ecological fitness, they are likely to be conserved in the genome of its progeny and the population at large. The genetic composition of microbial communities in their native ecosystems is accessible through metagenomics. We hypothesize that the co-variation of protein domain sequences across metagenomes from similar ecosystems will provide insights into their potential roles and aid further investigation.
METHODOLOGY/PRINCIPAL FINDINGS We calculated the correlation of Pfam protein domain sequences across the Global Ocean Sampling metagenome collection, employing conservative detection and correlation thresholds to limit results to well-supported hits and associations. We then examined intercorrelations between domains of unknown function (DUFs) and domains involved in known metabolic pathways using network visualization and cluster-detection tools. We used a cautious "guilt-by-association" approach, referencing knowledge-level resources to identify and discuss associations that offer insight into DUF function. We observed numerous DUFs associated with photobiologically active domains and prevalent in the Cyanobacteria. Other clusters included DUFs associated with DNA maintenance and repair, inorganic nutrient metabolism, and sodium-translocating transport domains. We also observed a number of clusters reflecting known metabolic associations and cases that predicted functional reclassification of DUFs.
CONCLUSION/SIGNIFICANCE Critically examining domain covariation across metagenomic datasets can grant new perspectives on the roles and associations of DUFs in an ecological setting. Targeted attempts at DUF characterization in the laboratory or in silico may draw on these insights, and opportunities to discover new associations and corroborate existing ones will arise as more large-scale metagenomic datasets emerge.
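The correlate, threshold, and cluster workflow summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the correlation measure, the threshold values, or the cluster-detection tool, so everything here is an assumption: Pearson correlation, the hypothetical cutoffs `min_r` and `min_samples`, connected components as a simple stand-in for cluster detection, and invented toy counts with placeholder domain names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlated_clusters(abundance, min_r=0.9, min_samples=3):
    """Group domains whose abundance profiles correlate at or above min_r.

    abundance: dict mapping domain name -> list of per-metagenome counts.
    min_r, min_samples: hypothetical thresholds, not values from the study.
    Returns the connected components (size >= 2) of the correlation graph.
    """
    # Detection filter: keep domains observed in at least min_samples metagenomes.
    kept = {d: v for d, v in abundance.items()
            if sum(1 for c in v if c > 0) >= min_samples}

    # Correlation filter: link two domains only if their profiles correlate strongly.
    neighbours = {d: set() for d in kept}
    for a, b in combinations(kept, 2):
        if pearson(kept[a], kept[b]) >= min_r:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Cluster detection (stand-in): connected components via depth-first search.
    seen, clusters = set(), []
    for start in kept:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        if len(component) > 1:
            clusters.append(component)
    return clusters


# Hypothetical toy counts across five metagenomes (not real GOS data).
toy = {
    "DUF_x": [1, 2, 3, 4, 5],
    "photosystem_domain": [2, 4, 6, 8, 10],
    "repair_domain": [5, 1, 4, 2, 3],
    "rare_domain": [0, 0, 0, 0, 7],
}
clusters = correlated_clusters(toy)
```

In this toy input, `DUF_x` ends up grouped with `photosystem_domain` because their profiles rise together, while `rare_domain` is dropped by the detection filter before any correlation is computed. A cluster of this kind would only suggest a hypothesis about the DUF's role, in keeping with the cautious guilt-by-association framing of the abstract.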
Similar resources
Comparative Phylogenetic Perspectives on the Evolutionary Relationships in the Brine Shrimp Artemia Leach, 1819 (Crustacea: Anostraca) Based on Secondary Structure of ITS1 Gene
This is the first study on phylogenetic relationships in the genus Artemia Leach, 1819 using the pattern and sequence of secondary structures of internal transcribed spacer 1 (ITS1). Significant intraspecific variation in the secondary structure of ITS1 rRNA was found in Artemia tibetiana. In the phylogenetic tree based on joined primary and secondary structure sequences, Artemia urmiana and pa...
Diverse alkane hydroxylase genes in microorganisms and environments
AlkB and CYP153 are important alkane hydroxylases responsible for aerobic alkane degradation in bioremediation of oil-polluted environments and microbial enhanced oil recovery. Since their distribution in nature is not clear, we investigated the 3,979 microbial genomes sequenced thus far and 137 metagenomes from terrestrial, freshwater, and marine environments. Hundreds of diverse a...
Evaluation of Essential Care Skills for Nurses Working at the Selected Infertility Clinics in Tehran, Iran, within 2016-2017: Nurses' Perspectives
Advancements in assisted reproductive technology (ART) have increased nurses' contribution to the provision of ART services. The present descriptive cross-sectional study aimed to determine essential care skills for nurses working at the selected infertility clinics in Tehran, Iran, based on their perspectives within 2016-2017. A total of 59 nurses were selected via a convenience sampling me...
Exploration of Noncoding Sequences in Metagenomes
Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories are the most common methods for association between functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore...
Metagenomics for biotechnology
CAMERA – Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis http://camera.calit2.net/ CAMERA is a tool to aid people using metagenomics to study microbial community ecology. A major emphasis is on marine microbial ecosystems. Genomes on Line: Metagenomes http://www.genomesonline.org/gold.cgi?want= Metagenomes The Genomes on line database (GOLD) is an excel...